1. SUMMARY OF RESEARCH PROGRESS AND RESULTS


During  the  two  years  supported  by  this  grant,  we  have  made  significant  progress 
both  in  areas  we  proposed  to  investigate  and  in  related  areas.  In  this  section,  we 
summarize  the  progress  in  those  areas  that  have  resulted  in  publications. 


1.1.  Stochastic  Control  of  Markov  Processes. 

We have continued our research program in adaptive estimation and control problems
for stochastic systems involving incomplete (or noisy) observations of
the state. The first class of problems we have been studying involves finite state
Markov chains with incomplete state observations and unknown parameters; in particular,
we have studied certain classes of quality control, replacement, and repair
problems.  We  have  in  the  past  considered  a  quality  control  problem  in  which  a 
system,  such  as  a  manufacturing  unit  or  computer  communications  network,  can 
be in either of two states: good or bad. A finite set of control actions is available
to the decision-maker. Under these actions, the system is either subject to Markovian
deterioration, or is restored to the good state. The problem is modeled as
a  partially  observed  Markov  decision  process  (POMDP).  Furthermore,  we  assume 
that  deterioration  of  the  system  depends  on  an  unknown  parameter,  namely  the 
probability  of  the  state  going  from  the  good  to  the  bad  state  in  one  time  epoch. 
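
As an illustration of the partially observed two-state model described above, the following is a minimal sketch of the Bayes filter for the probability that the system is in the bad state. The observation model (a sampled item is defective with probability q_good or q_bad depending on the state), the numerical values, and all function names are assumptions made for the example and are not taken from [5] or [7]. In a certainty equivalent adaptive scheme, theta would be replaced by the current parameter estimate.

```python
def predict(p_bad, theta):
    """Belief that the system is bad after one epoch of Markovian deterioration.

    theta is the probability of going from good to bad in one epoch; the bad
    state is assumed to persist until a restoring action is taken.
    """
    return p_bad + (1.0 - p_bad) * theta


def correct(p_bad, defective, q_good=0.1, q_bad=0.6):
    """Bayes correction after observing whether a sampled item is defective.

    q_good and q_bad are hypothetical defect probabilities in each state.
    """
    like_bad = q_bad if defective else 1.0 - q_bad
    like_good = q_good if defective else 1.0 - q_good
    num = p_bad * like_bad
    return num / (num + (1.0 - p_bad) * like_good)


def belief_update(p_bad, theta, defective, q_good=0.1, q_bad=0.6):
    """One filtering step: deterioration first, then a noisy observation."""
    return correct(predict(p_bad, theta), defective, q_good, q_bad)


if __name__ == "__main__":
    p = 0.0  # the system starts in the good state
    for y in [False, True, True]:
        p = belief_update(p, theta=0.1, defective=y)
        print(round(p, 4))
```

A restoring action (repair or replacement) would simply reset the belief to 0 in this sketch.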

The  adaptive  stochastic  control  problem  for  this  class  of  systems  is  fairly  difficult, 
because  the  presence  of  feedback  causes  the  system  transitions  to  depend  on  the 
parameter  estimates  and  introduces  discontinuities.  In  [5]  and  [7],  we  have  analyzed 
algorithms  for  this  quality  control  problem,  and  also  presented  a  general  framework 
for  the  study  of  optimality  of  adaptive  policies.  Using  the  ODE  method,  we  show 
that two algorithms, one based on maximum likelihood and another based on prediction
error, converge almost surely to the true parameter value. In addition, we
modify  the  method  of  Shwartz  and  Makowski  to  prove  optimality  of  the  resulting 
certainty equivalent adaptive policy, assuming only the existence of some sequence
of parameter estimates converging almost surely to the true parameter value. Again,
the discontinuities and partial observations in this problem preclude the direct use
of previously existing methods, but we have been able to generalize the method to
problems such as this. Also, we have avoided the very strong standard assumption
that the parameter estimates converge almost surely to the true parameter value
under any stationary policy. In more general directions, convexity properties were
explored  in  [4]  and  [8]. 

A survey of the literature on the ergodic control problem for discrete-time
controlled Markov processes was completed [3]. This was a major effort which puts
together  a  comprehensive  account  of  the  considerable  research  on  this  problem  over 
the  past  three  decades.  Our  exposition  ranges  from  finite  to  Borel  state  and  action 
spaces,  and  includes  a  variety  of  methodologies  to  find  and  characterize  optimal 
policies.  We  have  included  a  brief  historical  perspective  of  the  research  efforts  in 
this  area  and  have  compiled  a  substantial  bibliography.  In  the  process  we  have 
identified  several  important  questions  which  are  still  left  open  to  investigation. 

We  embarked  on  writing  a  research  monograph  entitled  “Ergodic  Control  of 
Markov  Chains  and  Stochastic  Games,”  intended  for  publication  as  a  volume  in  the 
series “Applications of Mathematics” by Springer-Verlag. The principal investigator
spent a two-month period as a visiting faculty member at the Indian Institute of Science,
Bangalore,  where  a  major  portion  of  this  effort  was  completed. 

Some  interesting  results  on  the  vanishing  discount  method  for  partially  observed 
Markov  chains  were  obtained  in  [11].  The  vanishing  discount  method  is  crucial  in 
establishing  solutions  for  the  average  cost  optimality  equation  in  controlled  Markov 
chains.  In  our  work  we  make  use  of  generalized  limits  of  functions  to  extend  the 
results  of  Platzman  and  Ross.  We  study  the  cases  of  a  finite  state  space  with 
compact  actions  as  well  as  countable  state  space  with  finite  actions. 
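
For orientation, the vanishing discount argument can be sketched in its standard fully observed form; the notation below (cost c, transition kernel P, discount factor β, reference state x_0) is assumed for the illustration and is not taken from [11], where the state is only partially observed.

```latex
% Discounted optimality equation with discount factor 0 < \beta < 1:
V_\beta(x) = \min_{a} \Big[ c(x,a) + \beta \sum_{y} P(y \mid x,a)\, V_\beta(y) \Big].
% Centering at a reference state x_0:
h_\beta(x) = V_\beta(x) - V_\beta(x_0), \qquad \rho_\beta = (1-\beta)\, V_\beta(x_0),
% gives the equivalent relation
\rho_\beta + h_\beta(x) = \min_{a} \Big[ c(x,a) + \beta \sum_{y} P(y \mid x,a)\, h_\beta(y) \Big].
% If \rho_\beta \to \rho and h_\beta \to h along some sequence \beta \uparrow 1,
% and the limit can be taken inside the minimum and the sum, the pair (\rho, h)
% satisfies the average cost optimality equation
\rho + h(x) = \min_{a} \Big[ c(x,a) + \sum_{y} P(y \mid x,a)\, h(y) \Big].
```

The difficulty lies in justifying the passage to the limit, which is the step where suitable notions of limit for the functions h_β are needed.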

1.2.  Hybrid  Stochastic  Systems. 

A  major  part  of  our  efforts  was  devoted  to  the  study  of  hybrid  stochastic  systems. 
This was motivated by control problems for systems exhibiting multiple modes or
failure  modes,  including  the  hierarchical  control  of  flexible  manufacturing  systems. 
A  flexible  manufacturing  system  (FMS)  consists  of  a  number  of  workstations,  with 
each  workstation  having  a  set  of  identical  machines.  The  model  used  involves  a 
hybrid  process  in  continuous  time  whose  state  is  given  by  a  pair  (X(t),  S(t)).  Here 
X(t)  denotes  the  downstream  buffer  stock  of  parts,  which  may  have  a  negative  value 
to  indicate  a  backlogged  demand.  The  continuous  component  X(t)  is  governed  by  a 
controlled diffusion process with a drift vector which depends on the discrete component
S(t). Thus, X(t) switches from one diffusion path to another as the discrete
component S(t) jumps from one state to another. On the other hand, the discrete
component S(t), denoting the number of operational machines, is influenced by the
inventory  size  and  production  scheduling,  and  can  also  be  controlled  by  various 
decisions  such  as  produce,  repair,  replace,  etc.  Hence,  S(t)  evolves  as  a  “controlled 
Markov  chain”  with  a  transition  matrix  depending  on  the  continuous  component. 
This  model  motivates  the  study  of  a  stochastic  optimization  problem  in  a  more 
abstract  setting  which  is  manifested  in  numerous  other  situations.  For  example,  it 
is  encountered  in  a  hybrid  model  proposed  for  the  study  of  dynamic  phenomena 
in  large  scale  interconnected  power  networks,  in  macroeconomic  problems  and  in 
dynamic  renewal  problems  in  general.  Our  treatment  of  the  optimization  problem 
[2],  [9]  was  based  on  a  convex  analytic  approach,  which  is  interesting  in  its  own  right 
and would be more flexible and powerful for certain other purposes, e.g., the pathwise
average cost problem or problems with several constraints, which do not seem
to be amenable to the analytic approach.

Also,  the  study  of  the  ergodic  cost  problem  led  to  a  number  of  very  significant 
results  in  [6],  [10],  [12]  and  [13].  We  have  analyzed  the  optimal  control  of  switching 
diffusions  with  pathwise  average  cost  criterion.  Under  certain  conditions  we  have 
established  the  existence  of  a  stable  Markov  nonrandomized  policy  which  is  a.s. 
optimal  in  the  class  of  all  admissible  policies.  Also,  the  existence  of  a  unique  solution 
of  the  associated  HJB  equations  is  established  in  a  certain  class,  and  the  optimal 
policy  is  characterized  as  a  minimizing  selector  of  an  appropriate  Hamiltonian.  We 
have  applied  our  results  to  a  manufacturing  model  and  have  obtained  an  optimal 
production  policy  which  is  of  hedging  point  type.  By  studying  the  recurrence  and 
ergodic properties of switching diffusions we have obtained two new results in partial
differential equations, viz. the maximum principle and Harnack’s inequality for a
uniformly  elliptic  system. 
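
To make the notion of a hedging point policy concrete, the following is a minimal simulation sketch of a one-machine version of the hybrid model (X(t), S(t)). It is an illustration only: the Euler-Maruyama discretization, the constant failure and repair rates (the model above allows them to depend on X(t)), the threshold z, and all parameter names and values are assumptions, not the model or results of [6], [10], [12], [13].

```python
import math
import random


def simulate_hedging(z=1.0, n_steps=1000, dt=0.01, r=2.0, d=1.0,
                     sigma=0.2, fail_rate=0.5, repair_rate=2.0, seed=0):
    """Simulate buffer stock X under a hedging point policy with threshold z.

    S is 1 when the machine is operational and 0 when it is down.
    The machine produces at its maximum rate r while X is below z and
    stops otherwise; d is the constant demand rate.
    """
    rng = random.Random(seed)
    x, s = 0.0, 1
    path = [x]
    for _ in range(n_steps):
        u = r * s if x < z else 0.0          # hedging point rule
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += (u - d) * dt + noise            # Euler-Maruyama step for X
        rate = fail_rate if s == 1 else repair_rate
        if rng.random() < rate * dt:         # jump of the discrete component
            s = 1 - s
        path.append(x)
    return path


if __name__ == "__main__":
    path = simulate_hedging()
    print(min(path), max(path))
```

With sigma and fail_rate set to zero, the stock climbs to the threshold z and then stays within one time step of it, which is the qualitative behavior the policy is meant to produce.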

In  [15],  we  study  a  parameterized  linear  system  perturbed  by  white  noise.  The 
parameters switch randomly from one state to another and are modeled as
a  finite  state  Markov  chain;  the  values  of  the  parameter  and  the  state  of  the  linear 
system  are  assumed  to  be  known  to  the  controller.  The  cost  function  is  quadratic. 
Under  certain  conditions,  we  find  a  linear  feedback  control  which  is  almost  surely 
optimal  for  the  pathwise  average  cost  over  the  infinite  planning  horizon. 

In [14] we investigate the weak and strong controllability of a class of stochastic
systems with a bounded Lipschitz nonlinearity. The concepts of weak and strong
controllability are natural generalizations of nondegeneracy and positive recurrence,
concepts well known in the theory of stochastic processes. Our results extend
those  known  for  linear  problems  and  are  stated  in  terms  of  verifiable  conditions. 

1.3.  Stochastic  Approximations. 

The  Ordinary  Differential  Equation  (ODE)  method  is  one  of  the  most  powerful 
tools  for  the  study  of  convergence  of  stochastic  approximations.  The  objective  in 
this  method  is  to  associate  to  a  given  algorithm  an  “averaged”  system  described 
by  a  differential  equation,  through  which  the  asymptotic  behavior  of  the  algorithm 
can  be  investigated.  Quite  often,  the  stochastic  problem  is  such  that  the  associated 
ODE  has  a  discontinuous  right  hand  side,  rendering  the  analysis  problematic.  This 
situation  is  not  adequately  covered  in  the  existing  literature  on  the  ODE  method. 
The main reason that discontinuous dynamics are not treated in the stochastic
averaging literature is the limitation inherited from the desire to apply the
Ascoli-Arzela  Theorem  in  the  Picard-type  iterations  of  the  shifted  piecewise-linear 
interpolants  of  the  process.  In  [16]  we  obtain  convergence  results  for  the  case  of  a 
Markovian noise with countable state space. Both state-independent and state-dependent
noises  are  considered. 
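
The basic idea of the ODE method can be illustrated on a simple example with a smooth right hand side; the discontinuous case treated in [16] is not covered by this sketch. The recursion theta_{n+1} = theta_n + a_n (h(theta_n) + noise), with h(theta) = -(theta - theta_star) and step sizes a_n = 1/(n+1), is compared with its averaged system d(theta)/dt = h(theta). The function names, the choice of h, and the independent Gaussian noise are assumptions made for the illustration.

```python
import random


def robbins_monro(theta_star=2.0, n_steps=20000, seed=0):
    """Stochastic approximation with decreasing step sizes a_n = 1/(n+1)."""
    rng = random.Random(seed)
    theta = 0.0
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)
        # Noisy observation of the mean field h(theta) = -(theta - theta_star)
        noisy_h = -(theta - theta_star) + rng.gauss(0.0, 1.0)
        theta += a_n * noisy_h
    return theta


def averaged_ode(theta_star=2.0, t_end=10.0, dt=0.001):
    """Forward Euler solution of the averaged ODE d(theta)/dt = -(theta - theta_star)."""
    theta = 0.0
    for _ in range(int(round(t_end / dt))):
        theta += dt * (-(theta - theta_star))
    return theta


if __name__ == "__main__":
    print(robbins_monro(), averaged_ode())
```

Both the noisy iterates and the ODE trajectory approach theta_star, the stable equilibrium of the averaged system.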

1.4.  Nonlinear  Systems. 

In  the  area  of  nonlinear  deterministic  systems  we  have  investigated  numerical 
issues  of  approximate  linearization.  Approximate  linearization  of  nonlinear  systems 
becomes  important  for  systems  where  the  nonlinearities  are  severe  enough  that  exact 
linearization  fails.  An  approximate  method  that  linearizes  the  system  up  to  a  certain 
order  was  originally  proposed  by  A.  Krener.  Since  less  restrictive  conditions  are 
required  for  approximate  linearization,  this  technique  offers  the  means  of  enlarging 
the  class  of  nonlinear  systems  to  which  linearizing  techniques  are  applicable.  In  [1] 
we  studied  the  problem  via  differential  forms.  We  show  that  this  approach  results 
in substantial computational savings. Furthermore, the method is constructive and
offers  a  simple  solution  to  the  problem. 

1.5.  Other  Related  Research. 

We have started working on a particular state-estimation problem involving interconnected
power systems. The objective here is to develop a systematic procedure
for identifying the best measurement points to be used in estimating the location
of harmonic sources in power systems. Harmonic distortion in power-distribution
systems  is  reaching  detrimental  levels  and  causes  problems  such  as  overheating  and 
failure of equipment, malfunction of protective equipment, nuisance tripping of sensitive
loads, and interference with communication networks. From an analytical
point of view, we posed the problem of selecting the measurement locations that
will minimize the expected value of the sum of squares of differences between estimated
and true parameter variables. The key features of our contribution to the
problem so far are (a) a model describing how the measurements are related to the
variables  to  be  estimated,  (b)  the  criteria  to  be  used  in  the  estimation  process,  (c)  a 
mathematical  model  of  the  uncertainties  present  in  the  problem,  and  (d)  structural 
properties of the best set of measurements. Among the findings thus far, we have
been able to demonstrate why capacitor busses normally serve as the best locations
for instrument placement. In addition, we developed a simple sequential procedure
for identifying the best measurement points and have shown through examples that
it is nearly optimal [17]. Finally, various generic network geometries have been
studied  analytically,  and  the  model  has  been  augmented  to  take  into  account  the 
effect  of  switching  capacitors  and  other  uncertainties. 
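
One way a sequential selection of measurement points could look is sketched below. This is not the procedure of [17]; it is a generic greedy rule under assumed conditions: a linear measurement model y_i = h_i . x + noise with independent noise of variance noise_var, a prior covariance P0 for the unknown variables x, and the trace of the error covariance as the expected sum of squared estimation errors. All names and numbers are hypothetical.

```python
def greedy_placement(H, P0, noise_var, k):
    """Pick k rows of H one at a time, each time taking the measurement
    that most reduces the trace of the estimation error covariance.

    H  : list of candidate measurement rows h_i (one per candidate location)
    P0 : prior covariance of the unknowns (symmetric, list of lists)
    Returns the chosen indices and the final trace of the covariance.
    """
    n = len(P0)
    P = [row[:] for row in P0]
    chosen = []
    for _ in range(min(k, len(H))):
        best = None
        for i, h in enumerate(H):
            if i in chosen:
                continue
            Ph = [sum(P[a][b] * h[b] for b in range(n)) for a in range(n)]
            s = sum(h[a] * Ph[a] for a in range(n)) + noise_var
            gain = sum(v * v for v in Ph) / s   # reduction in trace(P)
            if best is None or gain > best[1]:
                best = (i, gain, Ph, s)
        i, _, Ph, s = best
        chosen.append(i)
        # Covariance update for one scalar measurement: P - (P h)(P h)^T / s
        P = [[P[a][b] - Ph[a] * Ph[b] / s for b in range(n)] for a in range(n)]
    return chosen, sum(P[a][a] for a in range(n))


if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]
    P0 = [[4.0, 0.0], [0.0, 1.0]]
    print(greedy_placement(H, P0, noise_var=1.0, k=2))
```

A greedy rule of this kind is not guaranteed to be optimal in general; its result can be checked against exhaustive search on small examples.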